Rank | Count | Beginning |
---|---|---|
53 | 6847 | In |
15 | 2409 | De |
28 | 2106 | La |
23 | 1717 | El |
12 | 1439 | Nu |
84 | 1154 | Potrivit |
119 | 1072 | Pe |
6 | 1069 | Pentru |
37 | 1057 | Dupa |
188 | 949 | Si |
36 | 935 | Din |
67 | 894 | Daca |
159 | 890 | Cu |
5 | 889 | Este |
106 | 858 | O |
41 | 855 | Un |
101 | 723 | Am |
24 | 717 | A |
18 | 697 | Dar |
13 | 656 | Mai |
386 | 656 | Astfel, |
343 | 572 | Aceasta |
167 | 517 | Presedintele |
97 | 503 | Cei |
327 | 496 | Acesta |
17 | 471 | "Nu |
961 | 470 | Conform |
59 | 446 | Se |
78 | 403 | Iar |
678 | 398 | Pana |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N = 1.
The most frequent word-N-grams at the beginning of sentences give some insight into sentence composition.
For N=1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
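The same count can be sketched outside the database. The following Python function is an illustration only, not the code used to produce the table above: the name `sentence_beginnings` and its parameters `n` and `limit` are hypothetical, and it assumes the sentences are available as a list of plain strings. It mirrors the query by keeping everything before the n-th space of each sentence and counting the results.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent n-word sentence beginnings.

    Hypothetical helper: mirrors substring_index(sentence, ' ', n)
    followed by GROUP BY / ORDER BY cnt DESC / LIMIT.
    """
    counts = Counter()
    for sentence in sentences:
        # Splitting on a single space and rejoining the first n pieces
        # keeps exactly the text before the n-th space; a sentence with
        # fewer than n spaces is kept whole, as substring_index does.
        beginning = " ".join(sentence.split(" ")[:n])
        counts[beginning] += 1
    # most_common sorts by descending count.
    return counts.most_common(limit)
```

Calling the function with `n=2`, `n=3` or `n=4` gives the counts for the longer beginnings covered in the following subsections. As in the query, punctuation attached to a word is not stripped, which is why forms such as `Astfel,` and `"Nu` appear as separate entries in the table.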
4.3.1.2 Most Frequent Sentence Beginnings II
4.3.1.3 Most Frequent Sentence Beginnings III
4.3.1.4 Most Frequent Sentence Beginnings IV
4.3.1.1 Most Frequent Sentence Endings I
4.3.1.2 Most Frequent Sentence Endings II
4.3.1.3 Most Frequent Sentence Endings III
4.3.1.4 Most Frequent Sentence Endings IV